(1) Real Party in Interest

A statement identifying by name the real party in interest is contained in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial proceedings 
which will directly affect or be directly affected by or have a bearing on the Board's decision in 
the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection contained in 
the brief is correct. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is correct. 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct. 

(8) Evidence Relied Upon 

6,079,566    Eleftheriadis et al.    6-2000

(9) Grounds of Rejection

The following ground(s) of rejection are applicable to the appealed claims: 

Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on
sale in this country, more than one year prior to the date of application for patent in the United States.

Claims 1-43 are rejected under 35 U.S.C. 102(b) as being anticipated by US Pat. No.
6,079,566 issued to Eleftheriadis et al. (hereafter Pat '566).

Claims 1, 17 and 33:
Pat '566 discloses: 

(a) at least one multimedia information input interface receiving said multimedia 
information [MPEG-4 file, Fig 4, electronic memory 390, col 7, line 25] 

(b) a computer processor coupled to said at least one multimedia information input 
interface receiving said multimedia information therefrom [Fig 4, CPU 380], 

processing said multimedia information by performing object extraction 
processing to generate multimedia object descriptions from said multimedia information 
[col 1, lines 64-67, col 3, lines 30-37 discloses "encodes, stores and retrieves not just
frames but individual segments containing AV objects which are then assembled into a
scene according to embedded file information," abstract discloses that AV objects can 
be accessed using index information, col 5, lines 10-12 discloses Object IDs to uniquely 
identify AV objects] 

processing said generated multimedia object descriptions by object hierarchy 
processing to generate multimedia object hierarchy descriptions indicative of an 
organization of said object descriptions [col 3, lines 35-40, tree-structured approach] 

wherein at least one description record including said multimedia object 
descriptions and said multimedia object hierarchy descriptions is generated for content 
embedded within said multimedia information [file contains a header having streaming 
information, physical object information and logical object information] 

(c) a data storage system, operatively coupled to said processor for storing at least said
at least one description record [Fig 4, MPEG-4 player 360, video buffer col 7, lines 40-42]

Claims 2, 18 and 34: 

Pat '566 discloses wherein said multimedia information comprises image information, 
said multimedia object descriptions comprise image object descriptions, and said 
multimedia object hierarchy descriptions comprise image object hierarchy descriptions, 
[col 3, lines 35-40, tree-structured approach] 
Claims 3 and 19: 

Pat '566 discloses (a) image segmentation processing to segment each image in said 
image information into regions within said image, and (b) feature extraction processing
to generate one or more feature descriptions for one or more of said regions, whereby
said generated object descriptions comprise said one or more feature descriptions for
one or more of said regions [col 3, lines 20-28]
Claims 4, 20 and 35:

Pat '566 discloses wherein said one or more feature descriptions are selected from the 
group consisting of text annotations, color, texture, shape, size and position [col 30-40]
Claims 5, 21 and 36:

Pat '566 discloses wherein said object hierarchy processing comprises physical object 
hierarchy organization to generate physical object hierarchy descriptions of said image 
object descriptions that are based on spatial characteristics of said objects, such that said 
image object hierarchy descriptions comprise physical descriptions [col 30-40]. 
Claims 6, 22 and 37:

Pat '566 discloses wherein said object hierarchy processing further comprises logical 
object hierarchy organization to generate logical object hierarchy descriptions of said 
image object descriptions that are based on semantic characteristics of said objects, such 
that said image object hierarchy descriptions comprise physical and logical descriptions 
[col 2, lines 5-10] 
Claims 7 and 23: 

Pat '566 discloses (a) image segmentation processing to segment each image in said image
information into regions within said image and (b) feature extraction processing to
generate object descriptions for one or more of said regions, and wherein said physical
hierarchy organization and said logical hierarchy generate hierarchy descriptions of said
object descriptions for said one or more of said regions [col 2, lines 5-10]
Claims 8 and 24: 

Pat '566 discloses further comprising an encoder receiving said image object hierarchy 
descriptions and said image object descriptions, and encoding said image object 
hierarchy descriptions and said image object descriptions into encoded descriptions 
information, wherein said data storage system is operative to store said encoded 
description information as said at least one description record [Fig 4, 390] 
Claims 9, 25 and 38:

Pat '566 discloses wherein said multimedia information comprises video information, 
said multimedia object descriptions comprise video object descriptions including both 
event descriptions and object descriptions, and said multimedia hierarchy descriptions 
comprise video object hierarchy descriptions including both event hierarchy 
descriptions and object hierarchy descriptions [col 1, lines 30-40] 
Claims 10 and 26: 

Pat '566 discloses (a) temporal video segmentation processing to temporally segment 
said video information into one or more video events or groups of video events and 
generate event descriptions for said video events, (b) video object extraction processing 
to segment said one or more video events or groups of video events into one or more 
regions, and to generate object descriptions for said regions; and (c) feature extraction 
processing to generate one or more event feature descriptions for said one or more video 
events or groups of video events, and one or more object feature descriptions for said
one or more regions; wherein said generated video object descriptions include said
event feature descriptions and said object descriptions [col 3, lines 30-40]
Claims 11, 27 and 39:

Pat '566 discloses wherein said one or more event feature descriptions are selected from
the group consisting of text annotations, shot transition, camera motion, time and key 
frame, and wherein said one or more object feature descriptions are selected from the 
group consisting of color, texture, shape, size, position, motion, and time [col 3, lines 
27, 28, col 3, lines 30-35] 
Claims 12, 28 and 40:

Pat '566 discloses wherein said object hierarchy processing comprises physical event 
hierarchy organization to generate physical event hierarchy descriptions of said video 
object descriptions that are based on temporal characteristics of said video objects, such 
that said video hierarchy descriptions comprise temporal descriptions [col 3, lines 15-45]

Claims 13, 29, 41 and 43:

Pat '566 discloses wherein said object hierarchy processing further comprises logical 
event hierarchy organization to generate logical event hierarchy descriptions of said 
video object descriptions that are based on semantic characteristics of said video 
objects, such that said hierarchy descriptions comprise both temporal and logical 
descriptions [col 4, lines 1-15] 
Claims 14, 30 and 42:

Pat '566 discloses wherein said object hierarchy processing further comprises physical
and logical object hierarchy extraction processing, receiving said temporal and logical 
descriptions and generating object hierarchy descriptions for video objects embedded 
within said video information, such that said video hierarchy descriptions comprise 
temporal and logical event and object descriptions [col 3, line 15 through col 4, line 15] 
Claims 15 and 31: 

Pat '566 discloses (a) temporal video segmentation processing to temporally segment 
said video information into one or more video events or groups of video events and 
generate event descriptions for said video events, (b) video object extraction processing 
to segment said one or more video events or groups of video events into one or more 
regions, and to generate object descriptions for said regions; and (c) feature extraction
processing to generate one or more event feature descriptions for said one or more video 
events or groups of video events, and one or more object feature descriptions for said 
one or more regions; wherein said generated video object descriptions include said 
event feature descriptions and said object descriptions, and wherein said physical event 
hierarchy organization and said logical event hierarchy organization generate hierarchy 
descriptions from said event feature descriptions, and wherein said physical object 
hierarchy organization and said logical object hierarchy organization generate hierarchy 
descriptions from said object feature descriptions [col 3, line 15 through col 4, line 15] 
Claims 16 and 32: 

Pat '566 discloses an encoder receiving said video object hierarchy descriptions and 
said video object descriptions, and encoding said video object hierarchy descriptions
and said video object descriptions into encoded description information, wherein said
data storage system is operative to store said encoded description information as said at 
least one description record [col 3, line 15 through col 4, line 15] 

(10) Response to Argument 

Appellant Argues: 

Appellant states on page 10 that Eleftheriadis does not disclose the claim 1 
limitation "processing said multimedia information by performing object 
extraction processing to generate multimedia object descriptions." 
Examiner Responds: 

Examiner is not persuaded. Regarding "object description," Appellant
does not provide a specific and deliberate definition of same. A common
dictionary* definition of describe is "to tell or write about." The following
disclosure by Eleftheriadis is in line with the above definition.

Eleftheriadis discloses in column 1, lines 60-67, the following: 

The invention overcoming these and other problems in the art relates to a system, method 
and associated medium for processing object-based audiovisual information which 
encodes, stores and retrieves not just overall frames, but individual segments containing
AV objects which are then assembled into a scene according to embedded file 
information. The invention consequently provides very efficient streaming of and random 
access to component AV objects for even complex scenes. 



* Webster's New World College Dictionary, Fourth Edition

The above teaching that individual segments containing audio-visual
objects can be processed, encoded, stored and retrieved reads on the claim
limitation "processing said multimedia information by performing object
extraction processing to generate multimedia object descriptions."

Furthermore, Eleftheriadis discloses in column 2, lines 30-35 the
following: 

The AV (audio-visual) objects making up a scene are separately encoded 
and stored in file segments, and composition data for composing scenes out of 
those constituent objects is separately stored and can be randomly accessed and 
readily edited as well. Moreover the invention is capable of processing MPEG-1,
MPEG-2, audio, video and systems data files, along with coded MPEG-4 data
with its extended capabilities. 

The above teaching that (1) audio-visual objects are separately coded and 
stored in file segments and (2) constituent objects can be randomly accessed and 
readily edited (emphasis added) reads on the claim limitation "processing said 
multimedia information by performing object extraction processing to generate 
multimedia object descriptions." 

Furthermore, Eleftheriadis discloses in column 5, lines 10-13 that object 
IDs are used to uniquely identify the AV (audio-visual) objects encapsulated in 
AL PDUs 60, including the BIFS (binary format scene description information). 
The above teaching that object IDs are used to uniquely identify the AV (audio- 
visual) objects reads on the claim element "processing said multimedia 
information by performing object extraction processing to generate multimedia 
object descriptions."

Still further, Eleftheriadis discloses in column 5, lines 27-30 a one-byte
Profile field 460 (Figure 1) containing profile/level descriptions for each AV 
Object present in the file. The above teaching of a one-byte profile field 
containing profile/level descriptions for each AV object is proof that Eleftheriadis 
anticipates the claim limitation "processing said multimedia information by 
performing object extraction processing to generate multimedia object 
descriptions." 

Appellant Argues: 

Appellant states on page 10 that Eleftheriadis does not disclose
"processing said generated multimedia object descriptions by object hierarchy 
processing to generate multimedia object hierarchy descriptions." 
Examiner Responds: 

Examiner is not persuaded for the following reasons. 

It will be productive to interpret the above claim limitations with respect to
the specification of the instant application.

In Brief Description of the Drawings on Page 8, the specification states: 

Figs 6a and 6b are illustrative diagrams showing a set of video events and an 
exemplary hierarchal organization for the exemplary video objects shown in 
Figures. 

Page 21, fourth paragraph states: 

Nine exemplary video events are shown in Fig. 5, including the entire video 
sequence 500, a scene where the tiger is stalking the prey 510, a scene where the 
tiger is chasing its prey 520, a scene where the tiger catches its prey 530 and a
scene where the tiger is feeding 540. The later scene includes two events, one
where the tiger is holding the food 550, and the second where the tiger is feeding
the young 560. These video events, which are parallel to image objects may be
expressed as a set of events 0, 1, 2, 3, 4, 5, 6 as shown in Fig. 6a with the entire
video sequence being event 0, the scene where the tiger is stalking the prey 510
being event 1, the scene where the tiger is chasing its prey being event 2, the
scene where the tiger catches its prey 530 being event 4, the scene where the tiger
is feeding 540 being event 4^, the scene where the tiger is hiding the food 550
being event 5, and the scene where the tiger is feeding the young 560 being event 
6. 



Examiner concludes the following based on the above excerpts from the 
specification: 

(1) The tree structure of Fig. 6b is a hierarchical organization of objects. The
hierarchical organization of objects is plainly and simply the sequence of video events, i.e.,
event 0 is the entire video sequence followed by events 1, 2, 3, 4 and then events 5 and
6.

(2) There is no difference between a scene, an event and an image object.



Considering the teachings of the prior art made of record, Eleftheriadis discloses 
in column 3, lines 30-45: 

Individual components of a scene are coded as independent objects (e.g. 
arbitrarily shaped visual objects, or separately coded sounds). The audiovisual 
objects are transmitted to a receiving terminal along with scene description 
information which defines how the objects should be positioned in space and time 
in order to construct the scene to be presented to the user. The scene description 
follows a tree structured approach, similar to the Virtual Reality Modeling
Language (VRML) known in the art. The encoding of such scene description
information is more fully defined in Part 1 of the official ISO MPEG-4
specification (MPEG-4 Systems), known in the art. BIFS information is
transmitted in its own elementary stream, with its own time and clock stamp
information to ensure proper coordination of events at the receiving terminal.



^ Examiner notes that there are two (emphasis added) events numbered 4.

^ Examiner is confused by event 0, which is the entire sequence of events 1-6. One of ordinary skill in the art would
not understand why the entire sequence must be sent as event 0 and then sent a second time, with the sequence
individually numbered in the second transmittal.

The claimed "multimedia object descriptions" are anticipated by 
"individual components of a scene coded as independent objects" per the above 
teaching by Eleftheriadis. 

Firstly, the claimed "multimedia object hierarchy descriptions" are 
anticipated by the scene description which follows a tree-structure approach for 
defining how the objects should be positioned in space and time. Examiner notes
per Fig. 6b of the specification of the instant application and further as noted above, a
tree-structure representation of coded independent objects is the same as the 
claimed "object hierarchy descriptions." 

Secondly, the claimed "multimedia object hierarchy descriptions" are 
anticipated by BIFS (binary format scene) information, as best examiner is able to 
ascertain from the specification. Eleftheriadis discloses hierarchical organization
such as a tree-structure of events comprising coded independent objects. Event 0
is interpreted as the BIFS information transmitted in its own elementary stream,
and the sequencing of the events, i.e., events 1-6, is interpreted as the
time and clock stamp information which ensures proper coordination of events at
the receiving terminal.

Examiner has justifiably shown, as best examiner is able to ascertain
considering the above inconsistencies in the specification, that Eleftheriadis
anticipates the claim language "object hierarchy processing to generate
multimedia object descriptions."

(11) Related Proceeding(s) Appendix

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 

For the above reasons, it is believed that the rejections should be sustained. 

Respectfully submitted, 



Conferees: 